Welcome back everybody. So we have today our last lecture in deep learning and we have
a full schedule because there's a lot of things that need to be done. Henrik, could you help
me with the door? So we still have a couple of topics that we need to finish up. One of
them is the last few slides on object segmentation. So this is what we'll do in the first part
of the lecture. Then we'll also talk about weakly supervised learning and finally about
how to embed prior knowledge into neural networks. And then we will present possible research
topics if you're interested in continuing your deep learning efforts into the projects.
And we will display them. And in the very end we will give a short summary of the lecture.
Okay, so let's continue with, sorry, you've already seen this. We have to skip ahead to
the advanced topics. So this was a short refresher, right? Now everybody has seen the slides and
you're all following that we're talking about segmentation. So let's continue with the advanced
topics. And there's only a couple of slides here. So for example, what you can do is you
can also use an adversarial network as a kind of loss. This is then often also called adversarial
loss. So what you have here is essentially the input image, a segmentation network,
and then some class predictions. And then in addition you train an adversarial network
that gets input images together with either the predicted segmentations or the
ground truth segmentations. And you play this adversarial game. You remember, right,
that the adversarial network should decide whether this is a ground truth segmentation
or a segmentation made by the network. And this way you can also construct a particular
loss that will then try to generate more realistic segmentations. So oftentimes from automatic
segmentation you get errors that do not look plausible at all. And with the adversarial
network the idea is to make them look more realistic. And then what you typically
do is you don't use this loss alone, but you embed this into a combined loss function where
you have essentially a multi-class loss and then the adversarial loss, which is a binary
classification loss. And with this you can build such composite loss functions and then train
your segmentation network in order to produce more realistic-looking segmentations.
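
A minimal sketch of such a composite loss, assuming NumPy arrays for the network outputs. The function names, the weighting factor `weight`, and the scalar discriminator output are illustrative assumptions and not taken from the slides: the segmentation network is trained on a multi-class cross-entropy plus a weighted binary term that rewards fooling the adversarial network, while the adversarial network itself is trained on a binary classification loss.

```python
import numpy as np

def multiclass_cross_entropy(probs, labels):
    """Mean pixel-wise cross-entropy.
    probs: (H, W, C) softmax outputs, labels: (H, W) integer class ids."""
    h, w, _ = probs.shape
    # Pick the predicted probability of the true class at every pixel.
    picked = probs[np.arange(h)[:, None], np.arange(w)[None, :], labels]
    return float(-np.mean(np.log(picked + 1e-12)))

def binary_cross_entropy(p, target):
    """Binary cross-entropy for one discriminator output p in (0, 1)."""
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    return float(-(target * np.log(p) + (1.0 - target) * np.log(1.0 - p)))

def segmenter_loss(probs, labels, disc_on_pred, weight=0.5):
    """Composite loss for the segmentation network (weight is an assumed hyperparameter).
    disc_on_pred: discriminator's probability that the predicted
    segmentation is a ground truth segmentation."""
    # The segmenter wants the discriminator to answer "ground truth" (target 1).
    adversarial = binary_cross_entropy(disc_on_pred, 1.0)
    return multiclass_cross_entropy(probs, labels) + weight * adversarial

def discriminator_loss(disc_on_gt, disc_on_pred):
    """Binary classification loss of the adversarial network:
    ground truth segmentations are labeled 1, predicted ones 0."""
    return binary_cross_entropy(disc_on_gt, 1.0) + binary_cross_entropy(disc_on_pred, 0.0)
```

In an actual training loop the two losses would be minimized in alternation, one step for the adversarial network and one for the segmentation network; that loop is omitted here.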
And obviously the multitask learning also helps here with this adversarial task. What
else? There's of course the topic of instance segmentation. So in instance segmentation you're
not just doing a semantic segmentation like detecting which pixel belongs to which class.
Here we have bottle, cube, and cup. But here there are multiple cubes in the image. And in instance
segmentation you're actually interested in detecting that there are different cubes in
this image. You want to have cube one, cube two, cube three, and they are all different
instances. So this is why this is called instance segmentation. So it's a segmentation task
combined with a kind of object detection task. And in order to realize that you take essentially
a very similar approach: you combine an object detector with a segmentation. And there are
actually several examples in the literature: DeepMask, SharpMask, and probably the most popular
is Mask R-CNN, which is also available for reuse. And what you do then is essentially you combine
a detector and a pixel-wise segmentation network. And this helps you
to separate the instances from the segmentation mask. So here we can separate different persons
in this image. And we can also detect like the bus here in the background. And even in
cases where you have different persons riding an animal, like in this case, the segmentation
can be guided by the object detector. So how would you construct that? So you combine
your object detector essentially with a pixel-wise segmentation. So this then gives you a multitask
loss where you have a class loss and a box loss from the detector together with the loss of
the pixel-wise segmentation. And you combine that into a multitask loss. And this way you can
train this. But we are only talking very superficially about this so we don't go into the architecture
because we are slightly running out of time today. But I really recommend having a look
at the Mask R-CNN paper, which describes this very nicely. Okay, so these are some examples.
You can see that the simultaneous object detection and segmentation actually works fairly well.
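
To make the idea of separating instances with the help of a detector concrete, here is a toy sketch. It is not the Mask R-CNN architecture, which predicts a mask per detected region; it only assumes that a per-pixel class mask and a list of detected boxes are already available, and the function name `separate_instances` and the box format are hypothetical. Each detected box claims the pixels of its class inside the box as one instance.

```python
import numpy as np

def separate_instances(class_mask, detections):
    """Turn a semantic mask into an instance map using detector boxes.

    class_mask: (H, W) int array of per-pixel class ids (0 = background).
    detections: list of (class_id, (row0, col0, row1, col1)) boxes,
                with end-exclusive row1 and col1 (assumed format).
    Returns an (H, W) int array of instance ids (0 = no instance).
    """
    instance_map = np.zeros_like(class_mask)
    for instance_id, (class_id, (r0, c0, r1, c1)) in enumerate(detections, start=1):
        window = class_mask[r0:r1, c0:c1]
        target = instance_map[r0:r1, c0:c1]  # view into instance_map
        # Claim pixels of the detected class that no earlier box has taken.
        target[(window == class_id) & (target == 0)] = instance_id
    return instance_map

# Two cubes (class id 2) that a purely semantic mask cannot tell apart.
mask = np.zeros((4, 8), dtype=int)
mask[1:3, 1:3] = 2
mask[1:3, 5:7] = 2
boxes = [(2, (0, 0, 4, 4)), (2, (0, 4, 4, 8))]
instances = separate_instances(mask, boxes)  # left cube -> 1, right cube -> 2
```

Overlapping boxes are resolved here simply by detection order, which is a simplification chosen for brevity.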
So this is also a very popular method right now for visual detection tasks and results
Accessible via: Open access
Duration: 01:02:58 min
Recording date: 2019-07-25
Uploaded on: 2019-07-25 16:09:03
Language: en-US